Geostatistical Analysis Research Using Adult Data Set

Author

  • A. S. NAVEENKUMAR
Abstract

With the great progress of microelectronics and other related information technologies, together with the ever-broadening application of computers across a vast range of businesses and industries, large databases containing mixed-mode data are becoming commonplace. Today, large databases contain various modes of data collected from different components of a complex real-world system. Their use is not necessarily confined to classification. Many of them have no clearly defined class labels, or no explicit class information at all. Indeed, there are many reasons to discover all patterns in order to achieve a comprehensive analysis and understanding of the information within the data space. In the past, data mining and pattern discovery have by and large been developed for categorical databases, and all of the classification rules have been found from pre-labeled data samples. For a large mixed-mode database, discretizing its continuous data into interval events is still a practical approach. If the database has no class labels, however, there is no helpful correlation reference to guide that task. In fact, a large relational database may contain various correlated attribute clusters. To handle these kinds of problems, we first have to partition the database into subgroups of attributes that share some correlated relationship. This process is known as attribute clustering, and it is an important way to reduce the search space when discovering patterns. Furthermore, once correlated attribute groups are obtained, we can find in each of them the most representative attribute, the one with the strongest interdependence with all other attributes in that cluster, and use it as a candidate class label for that group. This provides a correlated attribute to drive the discretization of the other continuous data in each attribute cluster.
This thesis provides the theoretical framework, the methodology, and the computational system to achieve that goal. To validate the premises proposed in the thesis, extensive experiments using UCI Repository data of various types were performed to verify each of the fine points conceived. To demonstrate its usefulness for solving real-world problems, the developed methodology is applied to the real-world Adult data set.
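
The attribute-clustering idea in the abstract, finding the attribute in a group that has the strongest interdependence with the others, can be illustrated with a minimal sketch. The abstract does not say which interdependence measure is used, so the normalized mutual information below is an assumption, as are the equal-width pre-binning, the function names, and the toy columns (they are not drawn from the Adult data set).

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in Counter(values).values())


def interdependence(a, b):
    """Assumed measure: mutual information I(A;B) divided by joint entropy H(A,B)."""
    joint = entropy(list(zip(a, b)))
    if joint == 0.0:  # both attributes are constant, nothing to share
        return 0.0
    mutual = entropy(a) + entropy(b) - joint
    return mutual / joint


def equal_width_bins(values, k=3):
    """Crude unsupervised pre-binning of a continuous attribute into k intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0  # fall back to 1.0 when all values are equal
    return [min(int((v - lo) / width), k - 1) for v in values]


def representative(columns):
    """Name of the attribute with the largest summed interdependence with the rest."""
    def total(name):
        return sum(
            interdependence(columns[name], columns[other])
            for other in columns
            if other != name
        )
    return max(columns, key=total)


# Toy attribute group (hypothetical values, for illustration only).
hours = [10, 12, 20, 22, 30, 32, 40, 42]
columns = {
    "hours_bin": equal_width_bins(hours, k=4),  # [0, 0, 1, 1, 2, 2, 3, 3]
    "full_time": [0, 0, 0, 0, 1, 1, 1, 1],
    "even_id": [0, 0, 1, 1, 0, 0, 1, 1],
}

print(representative(columns))  # hours_bin
```

In this toy group, `hours_bin` shares information with both other columns while those two are independent of each other, so it is selected. In the setting the abstract describes, such an attribute could then stand in for a class label when discretizing the remaining continuous attributes of its cluster; that step is not shown here.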

Similar Resources

Spatial Design for Knot Selection in Knot-Based Low-Rank Models

Analysis of large geostatistical data sets usually entails expensive matrix computations. This creates challenges in implementing statistical inference for traditional Bayesian models. In addition, researchers are often faced with multiple spatial data sets with complex spatial dependence structures that are difficult to analyze. This is a problem for MCMC sampling algorith...

Clustering Geostatistical Data

We explore and compare different methods for the spatial clustering of geostatistical data. A new methodology based on the likelihood is proposed and compared to the approach by Allard and Monestiez (1999). Both methods are compared on a heavy metal concentration data set in the Swiss Jura.

Spatio-Temporal Analysis of Drought Severity Using Drought Indices and Deterministic and Geostatistical Methods (Case Study: Zayandehroud River Basin)

     Drought monitoring is a fundamental component of drought risk management. It is normally performed using various drought indices that are effectively continuous functions of rainfall and other hydrometeorological variables. In many instances, drought indices are used for monitoring purposes. Geostatistical methods allow the interpolation of spatially referenced data and the prediction of v...

Comparison of geostatistical and random sample survey analyses of Antarctic krill acoustic data

Data from the acoustic surveys of MV SA "Agulhas" and FRV "Walther Herwig", and the 1981 RRS "John Biscoe" South Georgia acoustic survey were analysed by geostatistical methods. Estimates of mean density (g m) of krill and their variances are compared with published results from statistical analyses based on random sampling theory. A further high-resolution geostatistical analysis of the ...

Spatial Correlation of Gold and Silver Elements Concentration in Ghezel Ozen Region by Using Geostatistical Methods

The purpose of this study was to determine and evaluate the spatial distribution of gold and silver concentrations using geostatistical methods. The study was carried out in the Ghezel Ozen area on 95 lithogeochemical samples. First, censored data were replaced, and outlier values were identified using box plots and Q-Q plots and reduced by the Doerffel ...

Publication date: 2011